{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Comparison of kernel ridge and Gaussian process regression\n\n\nBoth kernel ridge regression (KRR) and Gaussian process regression (GPR) learn\na target function by employing internally the \"kernel trick\". KRR learns a\nlinear function in the space induced by the respective kernel which corresponds\nto a non-linear function in the original space. The linear function in the\nkernel space is chosen based on the mean-squared error loss with\nridge regularization. GPR uses the kernel to define the covariance of\na prior distribution over the target functions and uses the observed training\ndata to define a likelihood function. Based on Bayes theorem, a (Gaussian)\nposterior distribution over target functions is defined, whose mean is used\nfor prediction.\n\nA major difference is that GPR can choose the kernel's hyperparameters based\non gradient-ascent on the marginal likelihood function while KRR needs to\nperform a grid search on a cross-validated loss function (mean-squared error\nloss). A further difference is that GPR learns a generative, probabilistic\nmodel of the target function and can thus provide meaningful confidence\nintervals and posterior samples along with the predictions while KRR only\nprovides predictions.\n\nThis example illustrates both methods on an artificial dataset, which\nconsists of a sinusoidal target function and strong noise. The figure compares\nthe learned model of KRR and GPR based on a ExpSineSquared kernel, which is\nsuited for learning periodic functions. The kernel's hyperparameters control\nthe smoothness (l) and periodicity of the kernel (p). Moreover, the noise level\nof the data is learned explicitly by GPR by an additional WhiteKernel component\nin the kernel and by the regularization parameter alpha of KRR.\n\nThe figure shows that both methods learn reasonable models of the target\nfunction. GPR correctly identifies the periodicity of the function to be\nroughly 2*pi (6.28), while KRR chooses the doubled periodicity 4*pi. Besides\nthat, GPR provides reasonable confidence bounds on the prediction which are not\navailable for KRR. A major difference between the two methods is the time\nrequired for fitting and predicting: while fitting KRR is fast in principle,\nthe grid-search for hyperparameter optimization scales exponentially with the\nnumber of hyperparameters (\"curse of dimensionality\"). The gradient-based\noptimization of the parameters in GPR does not suffer from this exponential\nscaling and is thus considerable faster on this example with 3-dimensional\nhyperparameter space. The time for predicting is similar; however, generating\nthe variance of the predictive distribution of GPR takes considerable longer\nthan just predicting the mean.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "print(__doc__)\n\n# Authors: Jan Hendrik Metzen <jhm@informatik.uni-bremen.de>\n# License: BSD 3 clause\n\n\nimport time\n\nimport numpy as np\n\nimport matplotlib.pyplot as plt\n\nfrom sklearn.kernel_ridge import KernelRidge\nfrom sklearn.model_selection import GridSearchCV\nfrom sklearn.gaussian_process import GaussianProcessRegressor\nfrom sklearn.gaussian_process.kernels import WhiteKernel, ExpSineSquared\n\nrng = np.random.RandomState(0)\n\n# Generate sample data\nX = 15 * rng.rand(100, 1)\ny = np.sin(X).ravel()\ny += 3 * (0.5 - rng.rand(X.shape[0]))  # add noise\n\n# Fit KernelRidge with parameter selection based on 5-fold cross validation\nparam_grid = {\"alpha\": [1e0, 1e-1, 1e-2, 1e-3],\n              \"kernel\": [ExpSineSquared(l, p)\n                         for l in np.logspace(-2, 2, 10)\n                         for p in np.logspace(0, 2, 10)]}\nkr = GridSearchCV(KernelRidge(), param_grid=param_grid)\nstime = time.time()\nkr.fit(X, y)\nprint(\"Time for KRR fitting: %.3f\" % (time.time() - stime))\n\ngp_kernel = ExpSineSquared(1.0, 5.0, periodicity_bounds=(1e-2, 1e1)) \\\n    + WhiteKernel(1e-1)\ngpr = GaussianProcessRegressor(kernel=gp_kernel)\nstime = time.time()\ngpr.fit(X, y)\nprint(\"Time for GPR fitting: %.3f\" % (time.time() - stime))\n\n# Predict using kernel ridge\nX_plot = np.linspace(0, 20, 10000)[:, None]\nstime = time.time()\ny_kr = kr.predict(X_plot)\nprint(\"Time for KRR prediction: %.3f\" % (time.time() - stime))\n\n# Predict using gaussian process regressor\nstime = time.time()\ny_gpr = gpr.predict(X_plot, return_std=False)\nprint(\"Time for GPR prediction: %.3f\" % (time.time() - stime))\n\nstime = time.time()\ny_gpr, y_std = gpr.predict(X_plot, return_std=True)\nprint(\"Time for GPR prediction with standard-deviation: %.3f\"\n      % (time.time() - stime))\n\n# Plot results\nplt.figure(figsize=(10, 5))\nlw = 2\nplt.scatter(X, y, c='k', label='data')\nplt.plot(X_plot, np.sin(X_plot), color='navy', lw=lw, label='True')\nplt.plot(X_plot, y_kr, color='turquoise', lw=lw,\n         label='KRR (%s)' % kr.best_params_)\nplt.plot(X_plot, y_gpr, color='darkorange', lw=lw,\n         label='GPR (%s)' % gpr.kernel_)\nplt.fill_between(X_plot[:, 0], y_gpr - y_std, y_gpr + y_std, color='darkorange',\n                 alpha=0.2)\nplt.xlabel('data')\nplt.ylabel('target')\nplt.xlim(0, 20)\nplt.ylim(-4, 4)\nplt.title('GPR versus Kernel Ridge')\nplt.legend(loc=\"best\",  scatterpoints=1, prop={'size': 8})\nplt.show()"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.6.9"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}